Hands-on_Ex03

Author

Dew Stella Chan

Published

January 25, 2025

Modified

February 1, 2025

3.1 Hands-on Exercise

Programming Interactive Data Visualisation with R

The learning outcome of this exercise is to learn how to create interactive data visualistaion by using functions in ggiraph and plotlyr packages. The material of this page is referred to Prof Kam’s webpage allocated for Hands-on Exercise 03.

Loading of Libraries

  • ggiraph package makes ‘ggplot’ graphics interactive.

  • plotly is for plotting interactive statistical graphs.

  • DT provides an R interface to the JavaScript library DataTables that create interactive table on html page.

  • tidyverse is part of the family of modern R packages specially designed to support data science, analysis and communication task including creating static statistical graphs.

  • patchwork is for combining multiple ggplot2 graphs into one figure.

pacman::p_load(ggiraph, plotly, 
               patchwork, DT, tidyverse)

Importing Data

exam_data <- read_csv("data/Exam_data.csv")

Interactive Data Visualisation - ggiraph methods

ggiraph is an htmlwidget and a ggplot2 extension. It allows ggplot graphics to be interactive.

Interactive is made with ggplot geometries that can understand three arguments:

  • Tooltip (data label): a column of data-sets that contain tooltips or data label to be displayed when the mouse is over elements.

  • Onclick: a column of data-sets that contain a JavaScript function to be executed when elements are clicked.

  • Data_id: a column of data-sets that contain an id to be associated with elements. If it used within a shiny application, elements associated with an id (data_id) can be selected and manipulated on client and server sides.

  • Refer to this article for more detail explanation.

Tooltip effect with tooltip aesthetic

Below shows a typical code chunk to plot an interactive statistical graph by using ggiraph package. Notice that the code chunk consists of two parts. First, an ggplot object will be created. Next, girafe() of ggiraph will be used to create an interactive svg object.

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE, 
    binwidth = 1, 
    method = "histodot") +
  scale_y_continuous(NULL, 
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)

By hovering the mouse pointer on an data point of interest, the student’s ID and class will be displayed.

Customisation of Tooltips

If the desired Tool Tips or Data labels consisted of multiple fields, they can be customised using a few lines of codes with formatting so it will be shown up according to desired.

# Cutomisation of Tooltips
exam_data$tooltip <- c(paste0(     
  "Name = ", exam_data$ID,         
  "\n Class = ", exam_data$CLASS)) 

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = exam_data$tooltip), 
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 8,
  height_svg = 8*0.618
)

Interactivity

Customisation Tooltip style

Code chunk below uses opts_tooltip() of ggiraph to customize tooltip rendering by add css declarations.

Notice that the background colour of the tooltip is black and the font colour is white and bold.

tooltip_css <- "background-color:white;
font-style:bold; color:black;" 

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = ID),                   
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(    #<<
    opts_tooltip(    #<<
      css = tooltip_css))
)                     

Displaying statistics on tooltip

Code chunk below shows an advanced way to customise tooltip. In this example, a function is used to compute 90% confident interval of the mean. The derived statistics are then displayed in the tooltip.

tooltip <- function(y, ymax, accuracy = .01) {
  mean <- scales::number(y, accuracy = accuracy)
  sem <- scales::number(ymax - y, accuracy = accuracy)
  paste("Mean maths scores:", mean, "+/-", sem)
}

gg_point <- ggplot(data=exam_data, 
                   aes(x = RACE),
) +
  stat_summary(aes(y = MATHS, 
                   tooltip = after_stat(  
                     tooltip(y, ymax))),  
    fun.data = "mean_se", 
    geom = GeomInteractiveCol,  
    fill = "light blue"
  ) +
  stat_summary(aes(y = MATHS),
    fun.data = mean_se,
    geom = "errorbar", width = 0.2, size = 0.2
  )

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8*0.618)

Hover effect with data_id aesthetic

Interactivity: Elements associated with a data_id (i.e CLASS) will be highlighted upon mouse over.

Noted that in the code argument, ” data_id= CLASS” so that students in the same class are being highlighted. In the earlier example, “tooltips = ID” is used to the student ID will be shown.

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(           
    aes(data_id = CLASS),             
    stackgroups = TRUE,               
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618                      
)

Styling hover effect

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)          

Different from previous example, in this example the ccs customisation request are encoded directly.

Combining tooltip and hover effect

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = CLASS, 
        data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)             

Exam data being plotted using other functions using ggiraph package.

Note

The code chunk below plotted the exam data for Maths and science in a scatter plot. Data points/ students in the same CLASS will be highlighted upon mouse over. At the same time, the tooltip will show the student and Class highlighted the students from the same class. The gender are plotted in different colours to allow viewers to make quick sense of data maths and science and maths scores in each class and if there are any gaps in gender.

Based on the data below, it is observed that students from 3A has higher score in Maths and Science compared to students from class 3I.

gg_scatter <- ggplot(
  data = exam_data,
  mapping = aes(
    x = MATHS, y = SCIENCE, color = GENDER,
    # here we add iteractive aesthetics
    tooltip = exam_data$tooltip, data_id = CLASS #using the tooltip or data label derived earlier part of the Hands-on Exercise
  )) + geom_point_interactive(
    size = 3, hover_nearest = TRUE)

css_default_hover <- girafe_css_bicolor(primary = "yellow", secondary = "red")

set_girafe_defaults(
  opts_hover = opts_hover(css = css_default_hover),
  opts_zoom = opts_zoom(min = 1, max =200 ),
  opts_tooltip = opts_tooltip(css = "padding:3px;background-color:#333333;color:white;"),
  opts_sizing = opts_sizing(rescale = TRUE),
  opts_toolbar = opts_toolbar(saveaspng = FALSE, position = "bottom", delay_mouseout = 5000)
)
girafe(ggobj = gg_scatter)

Click effect with onclick

onclick argument of ggiraph provides hotlink interactivity on the web.

The code chunk below shown an example of onclick.

exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(onclick = onclick),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618) 

Coordinated Multiple Views with ggiraph

Coordinated multiple views methods has been implemented in the data visualization below

p1 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +  
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

p2 <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") + 
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

girafe(code = print(p1 + p2), 
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")
         )
       )
Note

Notice that when a data point of one of the dotplot is selected, the corresponding data point ID on the second data visualisation will be highlighted too.

In order to build a coordinated multiple views as shown in the example above, the following programming strategy will be used:

Appropriate interactive functions of ggiraph will be used to create the multiple views. patchwork function of patchwork package will be used inside girafe function to create the interactive coordinated multiple views.

Interactive Data Visualisation - plotly methods

Plotly’s R graphing library create interactive web graphics from ggplot2 graphs and/or a custom interface to the (MIT-licensed) JavaScript library plotly.js inspired by the grammar of graphics. Different from other plotly platform, plot.R is free and open source.

There are two ways to create interactive graph by using plotly, they are:

by using plot_ly(), and by using ggplotly()

plot_ly(data = exam_data, 
             x = ~MATHS, 
             y = ~ENGLISH)

Working with visual variable: plot_ly() method

In the code chunk below, color argument is mapped to a qualitative visual variable (i.e. RACE).

plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE)

Add on from Hands on : Interactive plot with Gender.

plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~GENDER)

####creating an interactive scatter plot: ggplotly() method

p <- ggplot(data=exam_data, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
ggplotly(p)

Coordinated Multiple Views with plotly

The creation of a coordinated linked plot by using plotly involves three steps:

highlight_key() of plotly package is used as shared data. two scatterplots will be created by using ggplot2 functions. lastly, subplot() of plotly package is used to place them next to each other side-by-side.

d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))

Interactive Data Visualisation - crosstalk methods!

Interactive Data Table: DT package A wrapper of the JavaScript Library DataTables

Data objects in R can be rendered as HTML tables using the JavaScript library ‘DataTables’ (typically via R Markdown or Shiny).

DT::datatable(exam_data, class= "compact")

Code chunk below is used to implement the coordinated brushing shown above.

d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  
 
crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)  

Things to learn from the code chunk:

highlight() is a function of plotly package. It sets a variety of options for brushing (i.e., highlighting) multiple plots. These options are primarily designed for linking multiple plotly graphs, and may not behave as expected when linking plotly to another htmlwidget package via crosstalk. In some cases, other htmlwidgets will respect these options, such as persistent selection in leaflet.

bscols() is a helper function of crosstalk package. It makes it easy to put HTML elements side by side. It can be called directly from the console but is especially designed to work in an R Markdown document. Warning: This will bring in all of Bootstrap!.

3.2 Hands-on Exercise

Programming Animated Statistical Graphics with R

The 2nd part of the exercise is use animated graphic to attract the interest of the viewer and leave a deeper impression as compared to the static charts and graphs.

Loading the R packages

  • plotly, R library for plotting interactive statistical graphs.

  • gganimate, an ggplot extension for creating animated statistical graphs.

  • gifski converts video frames to GIF animations using pngquant’s fancy features for efficient cross-frame palettes and temporal dithering. It produces animated GIFs that use thousands of colors per frame.

  • gapminder: An excerpt of the data available at Gapminder.org. We just want to use its country_colors scheme.

  • tidyverse, a family of modern R packages specially designed to support data science, analysis and communication task including creating static statistical graphs.

pacman::p_load(readxl, gifski, gapminder,
               plotly, gganimate, tidyverse)
col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate_each_(funs(factor(.)), col) %>%
  mutate(Year = as.integer(Year))
col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate(across(col, as.factor)) %>%
  mutate(Year = as.integer(Year))

Animated Data Visualisation: gganimate methods

ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') 

Building the animated bubble plot

ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') +
  transition_time(Year) +       
  ease_aes('linear') 

Animated Data Visualisation: plotly

gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young')

ggplotly(gg)
gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young') + 
  theme(legend.position='none')

ggplotly(gg)

Building an animated bubble plot: plot_ly() method

bp <- globalPop %>%
  plot_ly(x = ~Old, 
          y = ~Young, 
          size = ~Population, 
          color = ~Continent,
          sizes = c(2, 100),
          frame = ~Year, 
          text = ~Country, 
          hoverinfo = "text",
          type = 'scatter',
          mode = 'markers'
          ) %>%
  layout(showlegend = FALSE)
bp